209 research outputs found

    Compositional Falsification of Cyber-Physical Systems with Machine Learning Components

    Cyber-physical systems (CPS), such as automotive systems, are starting to include sophisticated machine learning (ML) components. Their correctness, therefore, depends on properties of the inner ML modules. While learning algorithms aim to generalize from examples, they are only as good as the examples provided, and recent efforts have shown that they can produce inconsistent output under small adversarial perturbations. This raises the question: can the output from learning components lead to a failure of the entire CPS? In this work, we address this question by formulating it as a problem of falsifying signal temporal logic (STL) specifications for CPS with ML components. We propose a compositional falsification framework where a temporal logic falsifier and a machine learning analyzer cooperate with the aim of finding falsifying executions of the considered model. The efficacy of the proposed technique is shown on an automatic emergency braking system model with a perception component based on deep neural networks.

    A Machine Learning Trainable Model to Assess the Accuracy of Probabilistic Record Linkage

    Record linkage (RL) is the process of identifying and linking data that relates to the same physical entity across multiple heterogeneous data sources. Deterministic linkage methods rely on the presence of common uniquely identifying attributes across all sources, while probabilistic approaches use non-unique attributes and calculate similarity indexes for pairwise comparisons. A key component of record linkage is accuracy assessment: the process of manually verifying and validating matched pairs to further refine linkage parameters and increase the overall effectiveness of the linkage. This process, however, is time-consuming and impractical when applied to large administrative data sources where millions of records must be linked. Additionally, it is potentially biased, as the gold standard used is often the reviewer’s intuition. In this paper, we present an approach for assessing and refining the accuracy of probabilistic linkage based on different supervised machine learning methods (decision trees, naïve Bayes, logistic regression, random forest, linear support vector machines and gradient boosted trees). We used data sets extracted from huge Brazilian socioeconomic and public health care data sources. These models were evaluated using receiver operating characteristic plots, sensitivity, specificity and positive predictive values collected from a 10-fold cross-validation method. Results show that logistic regression outperforms other classifiers and enables the creation of a generalized, very accurate model to validate linkage results.
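The evaluation measures named in this abstract (sensitivity, specificity and positive predictive value) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `linkage_metrics` and the toy labels are hypothetical, and each candidate record pair is assumed to carry a reference label (1 = true match, 0 = non-match) and a classifier prediction.

```python
import math


def linkage_metrics(y_true, y_pred):
    """Sensitivity, specificity and PPV for binary match/non-match labels.

    y_true: reference labels per candidate pair (1 = true match, 0 = non-match).
    y_pred: classifier decisions for the same pairs.
    Names and inputs here are illustrative assumptions, not from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN rather than raising when a class is absent.
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),  # true matches correctly accepted
        "specificity": ratio(tn, tn + fp),  # non-matches correctly rejected
        "ppv": ratio(tp, tp + fp),          # accepted pairs that are true matches
    }


if __name__ == "__main__":
    # Toy data: eight candidate pairs, invented for illustration only.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 1, 0, 0, 0, 1, 0, 1]
    print(linkage_metrics(truth, predicted))
```

In a 10-fold cross-validation such as the one the abstract describes, these values would be computed on each held-out fold and then summarised across folds.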

    The leading digit distribution of the worldwide Illicit Financial Flows

    Benford's law states that in data sets from different phenomena leading digits tend to be distributed logarithmically, such that numbers beginning with smaller digits occur more often than those beginning with larger ones. In particular, the law is known to hold for different types of financial data. The Illicit Financial Flows (IFFs) exiting the developing countries are frequently discussed as hidden resources which could otherwise have been properly utilized for their development. We investigate here the distribution of the leading digits in the recent data on estimates of IFFs to look for the existence of a pattern as predicted by Benford's law, and establish that the frequency of occurrence of the leading digits in these estimates does closely follow the law.
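The logarithmic distribution referred to in this abstract is usually written as P(d) = log10(1 + 1/d) for leading digits d = 1, ..., 9. The Python sketch below compares observed leading-digit frequencies with that expectation. It is an illustration under stated assumptions, not the paper's analysis: the function names are hypothetical, and powers of two stand in for the IFF estimates, which are not reproduced here.

```python
import math
from collections import Counter


def benford_expected():
    """Expected leading-digit probabilities, P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """First significant digit of a non-zero number.

    Scientific notation with 17 significant digits avoids rounding a value
    such as 9999999.9 up to 1.0e+07.
    """
    if x == 0:
        raise ValueError("zero has no leading digit")
    return int(f"{abs(x):.16e}"[0])


def leading_digit_frequencies(values):
    """Observed relative frequency of each leading digit 1..9 (zeros skipped)."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("no non-zero values supplied")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


if __name__ == "__main__":
    # Stand-in data (an assumption): powers of two are a classic sequence
    # whose leading digits approximate Benford's law.
    sample = [2 ** k for k in range(1, 101)]
    expected = benford_expected()
    observed = leading_digit_frequencies(sample)
    for d in range(1, 10):
        print(f"{d}: observed {observed[d]:.3f}  expected {expected[d]:.3f}")
```

A formal comparison would normally add a goodness-of-fit statistic; which one the authors used is not stated in the abstract, so none is assumed here.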

    Automating Genomic Data Mining via a Sequence-based Matrix Format and Associative Rule Set

    There is an enormous amount of information encoded in each genome – enough to create living, responsive and adaptive organisms. Raw sequence data alone is not enough to understand function, mechanisms or interactions. Changes in a single base pair can lead to disease, such as sickle-cell anemia, while some large megabase deletions have no apparent phenotypic effect. Genomic features are varied in their data types, and annotation of these features is spread across multiple databases. Herein, we develop a method to automate exploration of genomes by iteratively exploring sequence data for correlations and building upon them. First, to integrate and compare different annotation sources, a sequence matrix (SM) is developed to contain position-dependent information. Second, a classification tree is developed for matrix row types, specifying how each data type is to be treated with respect to other data types for analysis purposes. Third, correlative analyses are developed to analyze features of each matrix row in terms of the other rows, guided by the classification tree as to which analyses are appropriate. A prototype was developed and was successful in detecting coinciding genomic features among genes, exons, repetitive elements and CpG islands.

    Gender-Associated Genes in Filarial Nematodes Are Important for Reproduction and Potential Intervention Targets

    Lymphatic filariasis is a neglected tropical disease that is caused by thread-like parasitic worms that live and reproduce in lymphatic vessels of the human host. There are no vaccines to prevent filariasis, and available drugs are not effective against all stages of the parasite. In addition, recent reports suggest that the filarial nematodes may be developing resistance to key medications. Therefore, there is an urgent need to identify new drug targets in filarial worms. The purpose of this study was to perform a genome-wide analysis of gender-associated gene transcription to improve understanding of key reproductive processes in filarial nematodes. Our results indicate that thousands of genes are differentially expressed in male and female adult worms. Many of those genes are involved in specific reproductive processes such as embryogenesis and spermatogenesis. In addition, expression of some of those genes is suppressed by tetracycline, a drug that leads to sterilization of adult female worms in many filarial species. Thus, gender-associated genes represent priority targets for design of vaccines and drugs that interfere with reproduction of filarial nematodes. Additional work with this type of integrated systems biology approach should lead to important new tools for controlling filarial diseases

    How local is local? Evidence from bank competition and corporate innovation in U.S.

    This paper aims to fill a research gap concerning the effects of bank competition on corporate innovation. In addition to evidence on the favorable effects of bank competition on corporate innovation, we show novel evidence that bank competition in a wider region and in neighboring states substitutes for local bank competition in financing corporate innovation activities. In the banking market, we show that ‘how local is local’ depends on the operating scope and information transparency of firms. Local banks have an information advantage over distant banks in financing local businesses and informationally opaque corporate innovation activities.

    Exploring the Zoonotic Potential of Mycobacterium avium Subspecies paratuberculosis through Comparative Genomics

    A comparative genomics approach was utilised to compare the genomes of Mycobacterium avium subspecies paratuberculosis (MAP) isolated from early onset paediatric Crohn's disease (CD) patients as well as Johne's diseased animals. Draft genome sequences were produced for MAP isolates derived from four CD patients, one ulcerative colitis (UC) patient, and two non-inflammatory bowel disease (IBD) control individuals using Illumina sequencing, complemented by comparative genome hybridisation (CGH). MAP isolates derived from two bovine and one ovine host were also subjected to whole genome sequencing and CGH. All seven human derived MAP isolates were highly genetically similar and clustered together with one bovine type isolate following phylogenetic analysis. Three other sequenced isolates (including the reference bovine derived isolate K10) were genetically distinct. The human isolates contained two large tandem duplications, the organisations of which were confirmed by PCR. Designated vGI-17 and vGI-18, these duplications spanned 63 and 109 open reading frames, respectively. PCR screening of over 30 additional MAP isolates (3 human derived, 27 animal derived and one environmental isolate) confirmed that vGI-17 and vGI-18 are common across many isolates. Quantitative real-time PCR of vGI-17 demonstrated that the proportion of cells containing the vGI-17 duplication varied between 0.01 and 15% amongst isolates, with human isolates containing a higher proportion of vGI-17 compared to most animal isolates. These findings suggest these duplications are transient genomic rearrangements. We hypothesise that the over-representation of vGI-17 in human derived MAP strains may enhance their ability to infect or persist within a human host by increasing genome redundancy and conferring crude regulation of protein expression across biologically important regions.

    Hypoxia and the Hypoxic Response Pathway Protect against Pore-Forming Toxins in C. elegans

    Pore-forming toxins (PFTs) are by far the most abundant bacterial protein toxins and are important for the virulence of many important pathogens. As such, cellular responses to PFTs critically modulate host-pathogen interactions. Although many cellular responses to PFTs have been recorded, little is understood about their relevance to pathological or defensive outcomes. To shed light on this important question, we have turned to the only genetic system for studying PFT-host interactions: Caenorhabditis elegans intoxication by Crystal (Cry) protein PFTs. We mutagenized and screened for C. elegans mutants resistant to a Cry PFT and recovered one mutant. Complementation, sequencing, transgenic rescue, and RNA interference data demonstrate that this mutant eliminates a gene normally involved in repression of the hypoxia (low oxygen response) pathway. We find that up-regulation of the C. elegans hypoxia pathway via the inactivation of three different genes that normally repress the pathway results in animals resistant to Cry PFTs. Conversely, mutation in the central activator of the hypoxia response, HIF-1, suppresses this resistance and can result in animals defective in PFT defenses. These results extend to a PFT that attacks mammals, since up-regulation of the hypoxia pathway confers resistance to Vibrio cholerae cytolysin (VCC), whereas down-regulation confers hypersusceptibility. The hypoxia PFT defense pathway acts cell autonomously to protect the cells directly under attack and is different from other hypoxia pathway stress responses. Two of the downstream effectors of this pathway include the nuclear receptor nhr-57 and the unfolded protein response. In addition, the hypoxia pathway itself is induced by PFT, and low oxygen is protective against PFT intoxication. These results demonstrate that hypoxia and induction of the hypoxia response protect cells against PFTs, and that the cellular environment can be modulated via the hypoxia pathway to protect against the most prevalent class of weapons used by pathogenic bacteria.